Abstract: For discrete memoryless multiple-access channels,
we propose a general definition of variable length codes with
a measure of the transmission rates at the receiver side. This
gives a receiver perspective on the multiple-access channel coding
problem and allows us to characterize the region of achievable
rates when the receiver is able to decode each transmitted
message at a different instant of time. We show an outer bound on
this region and derive a simple coding scheme that can achieve,
in particular settings, all rates within the region delimited by the
outer bound. In addition, we propose a random variable length
coding scheme that achieves the direct part of the block code
capacity region of a multiple-access channel without requiring
any agreement between the transmitters.

Index Terms: Achievable region, fountain codes, multiple-access
channels, random coding, variable length codes.



I. Introduction 

In this paper, we investigate the rates achievable by using
variable length codes over a two-user multiple-access channel.
We let the codewords of each transmitter be infinite
sequences of input symbols^1 and let the receiver decode each
transmitted message at some desired instant of time.^2 The
transmission "rate" of each message is then defined from
the perspective of the receiver, as the number of information symbols
transmitted per channel observation at the receiver. Notice
that in the usual sense these codes are rateless (or zero-rate);
here the transmission "rate" captures the trade-off between
the amount of information received and the "timeliness"
of the information. This setting can be seen as a "one-shot"
view on the multiple-access communication problem, as
opposed to the "multi-shot" view traditionally considered in
network information theory, where each transmitter has an
indefinite amount of information to send simultaneously to the
receiver. This approach may be useful to analyze
scenarios where synchronous users have infrequent messages
to transmit.

Note that a definition of rates from the perspective of
the receivers is made in [14] and [15] to analyze broadcast
channels where a common message is transmitted to several
receivers. Therein, the rate for each receiver is normalized by
the time the receiver needs to be "online" to reliably decode
the message. In this context, it is known that if the capacity
achieving distribution is the same for each individual link,
the maximum achievable transmission rate over each link can
be simultaneously achieved, a result that one cannot reach
with the classical definitions of rates and block codes. An
effective way of achieving this when the receivers are served
by erasure channels is to use fountain codes, such as LT codes
fSl or raptor codes flT\. Notice that an information theoretic
treatment of fountain codes with a careful definition of rate is
given in 1T31.

The work presented in this paper was partially supported by the National
Competence Center in Research on Mobile Information and Communication
Systems (NCCR-MICS), a center supported by the Swiss National Science
Foundation under grant number 5005-67322.

^1 There are no feedback links; however, in an implementation one can
imagine a weak feedback indicating when the receiver has made a decision.

^2 Note that both transmitters start to send their codewords at the same
instant of time.

In our setting, the following argument shows that, if we
require that the receiver decode the transmitted messages at the
same instant of time, the set of achievable rates is the same
for variable and fixed length codes. Assume, to the contrary,
that a variable length code achieving a higher rate exists, and let
$E[N]$ be its expected length. Then, by the law of large numbers,
the total length of $n$ successive transmissions is very likely to be
less than $n(E[N] + \delta)$ for any fixed $\delta > 0$. Thus,
a fixed length code of this length will achieve almost the
same rate with a small probability of error.^3 Therefore, the
interesting problem is to characterize the region of achievable
rates when the receiver is allowed to decode the messages at
different instants of time.

Here, we introduce a region of achievable rates that captures
the variability in the receiver decoding times and show an outer
bound on it. This outer bound can be related to the block code
capacity region and quantifies the possible gain over block codes
in terms of achievable rates. Then, we present two examples of
variable length codes, obtained by combining block codes,
that achieve any rate pair within the region delimited by the
outer bound, in specific settings which are described later.
This suggests that the gain in the achievable rates using variable
length codes comes only from the possibility for the receiver
to decode the transmitted messages in non-overlapping periods
of time.^4

To conclude, using random coding, we show the existence
of a variable length code that achieves all rate pairs within the
direct part (without time-sharing) of the block code capacity
region of a multiple-access channel, without requiring a
previous agreement between the transmitters,^5 a result that one
cannot obtain using only block codes and which might be
interesting in a decentralized setting.

^3 This argument can be formulated for any multiple-user channel.

^4 Notice that the corresponding analysis for variable length coding over
a degraded broadcast channel in which independent messages have to be
transmitted to each receiver is done in (£].

The next section provides the definition of a variable length
code for a multiple-access channel, along with an associated
region of achievable rates addressing the possibility for the
receiver to decode at different instants of time. In Section III,
we show an outer bound on this region. Then, in Section
IV, we relate the outer region formed by the outer bound to
the block code capacity region of a multiple-access channel,
and, in Section V, we present two examples of coding
schemes based on block codes that achieve the outer region
in particular settings. Finally, in Section VI, we explore the set
of rates achievable using variable length codes with a random
codebook and derive a decoding rule that achieves all rate
pairs within the direct part of the block code capacity region,
without requiring any agreement between the transmitters.



II. Definitions

We consider a discrete memoryless multiple-access channel
in which two transmitters send independent information to a
common receiver. The channel model is illustrated in Figure 1.
There are two sources, one producing a message
$W_1 \in \{1, 2, \ldots, M_1\}$ and the other producing a message
$W_2 \in \{1, 2, \ldots, M_2\}$. The channel consists of two input alphabets
$\mathcal{X}_1$ and $\mathcal{X}_2$, one output alphabet $\mathcal{Y}$, and a probability transition
function $p(y|x_1, x_2)$. By the memorylessness of the channel
we have, for any $n$,

$$p(y^n | x_1^n, x_2^n) = \prod_{i=1}^{n} p(y_i | x_{1i}, x_{2i}),$$

where $x_1^n \in \mathcal{X}_1^n$, $x_2^n \in \mathcal{X}_2^n$ and $y^n \in \mathcal{Y}^n$.

[Figure 1: block diagram. $W_1$ enters Transmitter 1 and $W_2$ enters Transmitter 2; both transmitters feed the channel $p(y|x_1, x_2)$, whose output goes to the Receiver, which produces estimates of $(W_1, W_2)$.]

Fig. 1. Multiple-Access Channel.

Let $N_1$ and $N_2$ be stopping times with respect to
$\{Y_i\}_{i \ge 1}$, the sequence of received letters. We define an
$(M_1, M_2, N_1, N_2)$ variable length code as two sequences
of mappings (encoders) $\{x_{1i}(W_1)\}_{i \ge 1}$ and $\{x_{2i}(W_2)\}_{i \ge 1}$,
and two decoding functions (decoders) with respect to the
decoding times $N_1$ and $N_2$,

$$g_1 : \mathcal{Y}^{N_1} \to \{1, 2, \ldots, M_1\} \quad \text{and} \quad g_2 : \mathcal{Y}^{N_2} \to \{1, 2, \ldots, M_2\}.$$

Note that $Y^{N_1}$ and $Y^{N_2}$ take values in the set of all finite
sequences of channel outputs. For deterministic stopping rules,
we can represent the set of all output sequences for which a
decision is made, at each decoder ($g_1$ and $g_2$), as the leaves
of a complete $|\mathcal{Y}|$-ary tree.^6 The leaves have a label from the
set of messages. Each decoder starts climbing the tree from
the root. At each time it chooses the branch that corresponds
to the received symbol. When a leaf is reached, the decoder
makes a decision as indicated by the label of the leaf (see Fig.
2 for an example).

^5 This means that no explicit or implicit agreement is made between the
transmitters, that is, each transmitter acts as if it were alone, completely
ignoring the other one.

Fig. 2. Example of a tree associated with $g_1$ for a binary-output
multiple-access channel with $M_1 = 4$. The set of all received
sequences for which a decision is made is represented by the leaves
of a complete binary tree. The decoder climbs the tree by going up
or down according to whether it receives a one or a zero, until it
reaches a leaf and makes a decision accordingly.
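The tree representation of a deterministic decoder can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `TREE`, its labels, and the helper names `is_complete_tree` and `decode` are hypothetical and do not come from the paper. Completeness is checked through prefix-freeness plus Kraft equality, which for a finite set of leaves is equivalent to every vertex being a leaf or having $|\mathcal{Y}|$ descendants.

```python
from fractions import Fraction

def is_complete_tree(leaves, alphabet_size):
    """True if the leaf sequences form a complete |Y|-ary tree.

    For a finite prefix-free set this is equivalent to Kraft equality.
    """
    seqs = list(leaves)
    for a in seqs:
        for b in seqs:
            if a != b and b[:len(a)] == a:
                return False          # a is a proper prefix of b
    kraft = sum(Fraction(1, alphabet_size ** len(s)) for s in seqs)
    return kraft == 1

def decode(leaves, received):
    """Climb the tree along `received`; return (message, stopping time)."""
    prefix = ()
    for n, y in enumerate(received, start=1):
        prefix += (y,)
        if prefix in leaves:
            return leaves[prefix], n
    return None, None                 # no leaf reached yet

# Hypothetical decoder g_1 for a binary-output channel with M_1 = 4.
TREE = {(0,): 1, (1, 0): 2, (1, 1, 0): 3, (1, 1, 1): 4}

print(is_complete_tree(TREE, 2))      # True
print(decode(TREE, [1, 1, 0, 1, 0]))  # (3, 3)
```

The second returned value plays the role of the stopping time $N_1$: it depends only on the received letters observed so far.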

Now, assuming that $(W_1, W_2)$ is uniformly distributed
over $\{1, 2, \ldots, M_1\} \times \{1, 2, \ldots, M_2\}$, we let the average
probability of error be the probability that the decoded
message pair is not equal to the transmitted one, i.e.,

$$P_e = \Pr\{g_1(Y^{N_1}) \neq W_1 \text{ or } g_2(Y^{N_2}) \neq W_2\},$$

and we define the transmission rates from the perspective of
the receiver as $\frac{\log M_1}{E[N_1]}$ and $\frac{\log M_2}{E[N_2]}$.^7 Notice that this definition
of rate is usually made for variable length coding over a single-user
channel, see, e.g., [1], [16]. However, this is a particular
choice that measures the rate by the amount of information
received over the average transmission time; one can imagine
other definitions that may lead to different results.

Definition 1: A rate pair $(R_1, R_2)$ is said to be achievable
for the multiple-access channel if, for all $\epsilon > 0$, there exists
an $(M_1, M_2, N_1, N_2)$ variable length code with
$\frac{\log M_1}{E[N_1]} \ge R_1$, $\frac{\log M_2}{E[N_2]} \ge R_2$ and $P_e \le \epsilon$.

The capacity region of the multiple-access channel is the
closure of the set of achievable rates.^8 Observe that with this
definition the capacity region is simply given by the rectangle
$[0, C_1] \times [0, C_2]$, where $C_1 = \max_{p(x_1)p(x_2)} I(X_1; Y | X_2)$
and $C_2 = \max_{p(x_1)p(x_2)} I(X_2; Y | X_1)$ are the suprema of
all achievable rates in each individual link. As previously
observed, any rate pair in this region can be achieved
by sending the messages of each user in separate periods
of time, and by making the ratio $E[N_1]/E[N_2]$ approach zero
(or infinity), thereby requiring that one user have infinitely
more information to transmit than the other.

^6 A tree is said to be a complete $|\mathcal{Y}|$-ary tree if any vertex is either a leaf
or has $|\mathcal{Y}|$ immediate descendants.

^7 Note that the expectations $E[N_1]$ and $E[N_2]$ are taken over the channel
realizations and over the pair of messages $(W_1, W_2)$.

^8 Here we consider the average probability of error. To use
$P_e = \max_{w_1, w_2} \Pr\{g_1(Y^{N_1}) \neq w_1 \text{ or } g_2(Y^{N_2}) \neq w_2 \,|\, W_1 = w_1, W_2 = w_2\}$
would in general lead to a different capacity region, as noticed in fff].

As mentioned in the introduction, here we want to consider
scenarios where each user has infrequent messages to transmit.
Thus, we are more interested in characterizing the region of
achievable rates for bounded values of the ratio $E[N_1]/E[N_2]$
and in capturing the variability of the receiver decoding times.
This leads us to consider the following region:

Definition 2: Let $N = \min\{N_1, N_2\}$. We denote by $\mathcal{C}_{r_1,r_2}$
the set of rates achievable by using variable length codes for
which $\frac{E[N]}{E[N_1]} \ge r_1$ and $\frac{E[N]}{E[N_2]} \ge r_2 = s r_1$, with $0 \le r_1, r_2 \le 1$.

This definition precludes the possibility that the receiver
decodes one transmitted message in a short period of time
while the other one takes a large period of time, the ratio
between the two average decoding times being governed by
the values of $r_1$ and $r_2$. The justification for the particular
formulation of the restrictions imposed on $E[N_1]$ and $E[N_2]$
comes from the outer bound that we found on this region; this
bound is presented in the next section. Section V will then
describe coding schemes based on block codes that achieve
the outer region when additional constraints are imposed on
$r_1$ and $r_2$.

III. Outer Region 

In order to prove our outer bound on $\mathcal{C}_{r_1,r_2}$ we need two
lemmas, which give upper bounds on the mutual informations
of interest in terms of single-letter expressions.

Lemma 1: The following inequalities hold:

$$\begin{aligned}
I(W_1; Y^N | W_2) &\le E[N]\, I(X_1; Y | X_2, Q) + \log(e E[N]) \\
I(W_2; Y^N | W_1) &\le E[N]\, I(X_2; Y | X_1, Q) + \log(e E[N]) \\
I(W_1, W_2; Y^N) &\le E[N]\, I(X_1, X_2; Y | Q) + \log(e E[N]),
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.

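Proof: Let $\Lambda_i = 1\{N \ge i\}$;^9 then, from the chain rule
for mutual information, we have

$$\begin{aligned}
I(W_1; Y^N | W_2) &= I(W_1; Y_1\Lambda_1, \Lambda_1, \ldots, Y_n\Lambda_n, \Lambda_n, \ldots | W_2) \\
&= I(W_1; \Lambda_1 | W_2) + I(W_1; Y_1\Lambda_1 | \Lambda_1, W_2) + \cdots \\
&\quad + I(W_1; \Lambda_n | (Y\Lambda)^{n-1}, \Lambda^{n-1}, W_2) \\
&\quad + I(W_1; Y_n\Lambda_n | (Y\Lambda)^{n-1}, \Lambda^{n}, W_2) + \cdots \\
&= \sum_{i=1}^{\infty} I(W_1; \Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i-1}, W_2) \\
&\quad + \sum_{i=1}^{\infty} I(W_1; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2).
\end{aligned}$$

The first summation can be upper bounded as

$$\begin{aligned}
\sum_{i=1}^{\infty} I(W_1; \Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i-1}, W_2) &\le \sum_{i=1}^{\infty} H(\Lambda_i | \Lambda^{i-1}) \\
&= H(\Lambda_1, \Lambda_2, \ldots) \\
&= H(N) \\
&\le \log(e E[N]),
\end{aligned}$$

where we use the fact that conditioning reduces entropy, and
the last inequality is proved in p4l and fS*, §1.3], for any
non-negative discrete random variable, using the log sum
inequality.

^9 Here $1\{N \ge i\}$ is equal to 1 if $N \ge i$ and equal to 0 otherwise. Also,
we define $A_i\Lambda_i$ as being equal to $A_i$ if $N \ge i$ and equal to $\bot$ otherwise,
where $\bot$ denotes a symbol distinct from any of the letters in $(\mathcal{X}_1, \mathcal{X}_2, \mathcal{Y})$,
and $A_i$ can be either $X_{1i}$, $X_{2i}$ or $Y_i$.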
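As a numerical sanity check of the entropy bound $H(N) \le \log(e E[N])$ used for the first summation, the sketch below evaluates both sides for truncated geometric stopping times. The choice of distribution, the truncation point and the use of base-2 logarithms are assumptions made for illustration only; the function names are hypothetical.

```python
import math

def entropy_bits(pmf):
    """Entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def mean(pmf):
    """Mean of N when pmf[n-1] = Pr(N = n), n = 1, 2, ..."""
    return sum(n * p for n, p in enumerate(pmf, start=1))

def truncated_geometric(q, n_max):
    """Geometric law with parameter q, truncated to {1, ..., n_max}."""
    weights = [q * (1 - q) ** (n - 1) for n in range(1, n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

for q in (0.5, 0.1, 0.01):
    pmf = truncated_geometric(q, 5000)
    lhs = entropy_bits(pmf)                 # H(N)
    rhs = math.log2(math.e * mean(pmf))     # log(e E[N])
    print(f"q={q}: H(N)={lhs:.3f} bits, log2(e E[N])={rhs:.3f} bits")
```

The gap between the two sides stays bounded while $E[N]$ grows, which is why this term is sublinear in the expected decoding time.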

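For the second summation, we can write

$$\begin{aligned}
I(W_1; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2)
&= H(Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2) - H(Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2, W_1) \\
&\overset{(a)}{\le} H(Y_i\Lambda_i | X_{2i}\Lambda_i, \Lambda_i) - H(Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2, W_1) \\
&\overset{(b)}{=} H(Y_i\Lambda_i | X_{2i}\Lambda_i, \Lambda_i) - H(Y_i\Lambda_i | X_{1i}\Lambda_i, X_{2i}\Lambda_i, \Lambda_i) \\
&= \Pr(\Lambda_i = 1)\,[H(Y_i | X_{2i}, \Lambda_i = 1) - H(Y_i | X_{1i}, X_{2i}, \Lambda_i = 1)] \\
&= \Pr(N \ge i)\, I(X_{1i}; Y_i | X_{2i}, \Lambda_i = 1),
\end{aligned}$$

where (a) follows since conditioning reduces entropy and
$X_{2i}$ is a function of $W_2$. In (b) we remark that, knowing
$\Lambda_i$, $Y_i\Lambda_i$ is independent of the past values $\{\Lambda_j\}_{j<i}$, and
that $(X_{1i}, X_{2i})$ is a function of $(W_1, W_2)$ and then, given
$(X_{1i}, X_{2i})$, $Y_i$ is independent of $(W_1, W_2)$ and of the past
received values. The other equalities follow by definition of
the corresponding quantities.

Next, observe that $p(y_i | x_{1i}, x_{2i}, \Lambda_i = 1) = p(y_i | x_{1i}, x_{2i})$,
thus $I(X_{1i}; Y_i | X_{2i}, \Lambda_i = 1) = I(X_{1i}; Y_i | X_{2i})$, with
$p(x_{1i}) = p(x_{1i} | \Lambda_i = 1)$ and $p(x_{2i}) = p(x_{2i} | \Lambda_i = 1)$. Hence,
we get

$$\begin{aligned}
\sum_{i=1}^{\infty} I(W_1; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2)
&\le \sum_{i=1}^{\infty} \Pr(N \ge i)\, I(X_{1i}; Y_i | X_{2i}) \\
&= E[N] \sum_{i=1}^{\infty} \frac{\Pr(N \ge i)}{E[N]}\, I(X_{1i}; Y_i | X_{2i}).
\end{aligned}$$

Now let $a_i = \frac{\Pr(N \ge i)}{E[N]}$; note that $a_i \ge 0$ for all $i$, and
$\sum_i a_i = 1$. Thus, we can define an integer random variable $Q$
by setting $\Pr(Q = i) = a_i$, for all $i \in \{1, 2, \ldots\}$. Using this,
the preceding equation becomes

$$\begin{aligned}
\sum_{i=1}^{\infty} I(W_1; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_2)
&\le E[N] \sum_{i=1}^{\infty} \Pr(Q = i)\, I(X_{1Q}; Y_Q | X_{2Q}, Q = i) \\
&= E[N]\, I(X_1; Y | X_2, Q),
\end{aligned}$$

where $X_1 = X_{1Q}$, $X_2 = X_{2Q}$ and $Y = Y_Q$ are new random
variables whose distributions depend on $Q$ in the same way
as the distributions of $X_{1i}$, $X_{2i}$ and $Y_i$ depend on $i$. Notice
that $Q \to (X_1, X_2) \to Y$ forms a Markov chain. Therefore,
we obtain

$$I(W_1; Y^N | W_2) \le E[N]\, I(X_1; Y | X_2, Q) + \log(e E[N]),$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.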
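The construction of the time-sharing variable $Q$ from the stopping time can be made concrete with a short sketch. It computes $a_i = \Pr(N \ge i)/E[N]$ from a stopping-time distribution and relies on the identity $\sum_{i \ge 1} \Pr(N \ge i) = E[N]$. The function name and the example distribution are hypothetical.

```python
def time_sharing_pmf(stop_pmf):
    """Return a_i = Pr(N >= i) / E[N], where stop_pmf[n-1] = Pr(N = n)."""
    mean = sum(n * p for n, p in enumerate(stop_pmf, start=1))
    tail = 1.0                      # Pr(N >= 1)
    a = []
    for p in stop_pmf:
        a.append(tail / mean)
        tail -= p                   # Pr(N >= i + 1)
    return a

# Hypothetical stopping time taking values 1, 2, 3.
a = time_sharing_pmf([0.5, 0.25, 0.25])
print(a, sum(a))
```

Early time indices receive more weight because the decoder is still observing the channel at time $i$ only when $N \ge i$.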
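The second inequality follows in a symmetric way. For
the last one, we proceed in the same manner. Consider

$$\begin{aligned}
I(W_1, W_2; Y^N) &= I(W_1, W_2; Y_1\Lambda_1, \Lambda_1, \ldots, Y_n\Lambda_n, \Lambda_n, \ldots) \\
&= \sum_{i=1}^{\infty} I(W_1, W_2; \Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i-1}) \\
&\quad + \sum_{i=1}^{\infty} I(W_1, W_2; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}).
\end{aligned}$$

As before, the first summation can be upper bounded as

$$\sum_{i=1}^{\infty} I(W_1, W_2; \Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i-1}) \le \log(e E[N]).$$

For the second summation, we have

$$\begin{aligned}
I(W_1, W_2; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i})
&= H(Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}) - H(Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i}, W_1, W_2) \\
&\le H(Y_i\Lambda_i | \Lambda_i) - H(Y_i\Lambda_i | X_{1i}\Lambda_i, X_{2i}\Lambda_i, \Lambda_i) \\
&= \Pr(\Lambda_i = 1)\,[H(Y_i | \Lambda_i = 1) - H(Y_i | X_{1i}, X_{2i}, \Lambda_i = 1)] \\
&= \Pr(N \ge i)\, I(X_{1i}, X_{2i}; Y_i | \Lambda_i = 1),
\end{aligned}$$

since $(X_{1i}, X_{2i})$ is a function of $(W_1, W_2)$, and given
$(X_{1i}, X_{2i})$, $Y_i$ is independent of the past received values.

Then, observe that $p(y_i | x_{1i}, x_{2i}, \Lambda_i = 1) = p(y_i | x_{1i}, x_{2i})$,
thus $I(X_{1i}, X_{2i}; Y_i | \Lambda_i = 1) = I(X_{1i}, X_{2i}; Y_i)$, with
$p(x_{1i}) = p(x_{1i} | \Lambda_i = 1)$ and $p(x_{2i}) = p(x_{2i} | \Lambda_i = 1)$. Hence,
we get

$$\begin{aligned}
\sum_{i=1}^{\infty} I(W_1, W_2; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i})
&\le \sum_{i=1}^{\infty} \Pr(N \ge i)\, I(X_{1i}, X_{2i}; Y_i) \\
&= E[N] \sum_{i=1}^{\infty} \frac{\Pr(N \ge i)}{E[N]}\, I(X_{1i}, X_{2i}; Y_i).
\end{aligned}$$

Now, as done before, let $a_i = \frac{\Pr(N \ge i)}{E[N]}$, and define an integer
random variable $Q$ by setting $\Pr(Q = i) = a_i$, for all
$i \in \{1, 2, \ldots\}$. Using this, the preceding equation becomes

$$\begin{aligned}
\sum_{i=1}^{\infty} I(W_1, W_2; Y_i\Lambda_i | (Y\Lambda)^{i-1}, \Lambda^{i})
&\le E[N] \sum_{i=1}^{\infty} \Pr(Q = i)\, I(X_{1Q}, X_{2Q}; Y_Q | Q = i) \\
&= E[N]\, I(X_1, X_2; Y | Q),
\end{aligned}$$

where $X_1 = X_{1Q}$, $X_2 = X_{2Q}$ and $Y = Y_Q$ are random
variables whose distributions depend on $Q$ in the same way
as the distributions of $X_{1i}$, $X_{2i}$ and $Y_i$ depend on $i$. Notice
that $Q \to (X_1, X_2) \to Y$ forms a Markov chain.

Therefore, we obtain

$$I(W_1, W_2; Y^N) \le E[N]\, I(X_1, X_2; Y | Q) + \log(e E[N]),$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$. □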



The proof of the next lemma is given in the appendix, the main
ideas being those presented in the proof of the previous lemma.

Lemma 2: We have the following inequalities:

$$\begin{aligned}
I(W_1; Y_{N+1}^{N_1} | Y^N, W_2) &\le E[N_1 - N]\, C_1 + \log(e E[N_1 - N]) \\
I(W_2; Y_{N+1}^{N_2} | Y^N, W_1) &\le E[N_2 - N]\, C_2 + \log(e E[N_2 - N]).
\end{aligned}$$

Proof: See Appendix A. □

Notice that in these upper bounds the additional terms
corresponding to the information provided by the length of the
codewords are sublinear in the average decoding times. This
is an interesting fact that we use to show our outer bound on
the region of achievable rates $\mathcal{C}_{r_1,r_2}$, given by the following
theorem.

Theorem 3: (Outer bound) Any rate pair $(R_1, R_2) \in \mathcal{C}_{r_1,r_2}$
must satisfy

$$\begin{aligned}
R_1 &\le r_1 I(X_1; Y | X_2, Q) + (1 - r_1) C_1 \\
R_2 &\le r_2 I(X_2; Y | X_1, Q) + (1 - r_2) C_2 \\
s R_1 + R_2 &\le r_2 I(X_1, X_2; Y | Q) + s(1 - r_1) C_1 + (1 - r_2) C_2,
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$,
with $|\mathcal{Q}| \le 2$.

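Proof: Let $W_i$ be uniformly distributed over
$\{1, 2, \ldots, M_i\}$, $i = 1, 2$. Then,

$$\begin{aligned}
I(W_1, W_2; Y^{\max(N_1, N_2)}) &= H(W_1, W_2) - H(W_1, W_2 | Y^{\max(N_1, N_2)}) \\
&= E[N_1] R_1 + E[N_2] R_2 - H(W_1, W_2 | Y^{\max(N_1, N_2)}),
\end{aligned}$$

and

$$\begin{aligned}
I(W_1; Y^{N_1} | W_2) &= H(W_1 | W_2) - H(W_1 | Y^{N_1}, W_2) \\
&\ge E[N_1] R_1 - H(W_1 | Y^{N_1}),
\end{aligned}$$

and

$$\begin{aligned}
I(W_2; Y^{N_2} | W_1) &= H(W_2 | W_1) - H(W_2 | Y^{N_2}, W_1) \\
&\ge E[N_2] R_2 - H(W_2 | Y^{N_2}).
\end{aligned}$$

From Fano's inequality, we have

$$\begin{aligned}
E[N_1](R_1 - \epsilon) &\le I(W_1; Y^{N_1} | W_2) \\
E[N_2](R_2 - \epsilon) &\le I(W_2; Y^{N_2} | W_1) \\
E[N_1](R_1 - \epsilon) + E[N_2](R_2 - \epsilon) &\le I(W_1, W_2; Y^{\max(N_1, N_2)}),
\end{aligned}$$

where $\epsilon \to 0$ as $P_e \to 0$.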

Applying the chain rule for mutual information and
remembering that $N = \min(N_1, N_2)$, we can write

$$I(W_1; Y^{N_1} | W_2) = I(W_1; Y^N | W_2) + I(W_1; Y_{N+1}^{N_1} | Y^N, W_2),$$

with the convention that $Y_{N+1}^{N} = \emptyset$.

Then, using Lemma 1 and Lemma 2, we get

$$\begin{aligned}
I(W_1; Y^{N_1} | W_2) &\le E[N]\, I(X_1; Y | X_2, Q) + E[N_1 - N]\, C_1 \\
&\quad + \log(e E[N]) + \log(e E[N_1 - N]),
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.
In a symmetric way, we obtain

$$\begin{aligned}
I(W_2; Y^{N_2} | W_1) &\le E[N]\, I(X_2; Y | X_1, Q) + E[N_2 - N]\, C_2 \\
&\quad + \log(e E[N]) + \log(e E[N_2 - N]),
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.

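Now, using the chain rule for mutual information, we
have

$$I(W_1, W_2; Y^{\max(N_1, N_2)}) = I(W_1, W_2; Y^N) + I(W_1, W_2; Y_{N+1}^{\max(N_1, N_2)} | N, Y^N).$$

Lemma 1 implies

$$I(W_1, W_2; Y^N) \le E[N]\, I(X_1, X_2; Y | Q) + \log(e E[N]),$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$. For
the second term, the following holds:

$$\begin{aligned}
I(W_1, W_2; Y_{N+1}^{\max(N_1, N_2)} | N, Y^N)
&= \Pr(N = N_1)\, I(W_1, W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}) \\
&\quad + \Pr(N = N_2)\, I(W_1, W_2; Y_{N_2+1}^{N_1} | N = N_2, Y^{N_2}) \\
&= \Pr(N_1 \le N_2)\, I(W_1, W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}) \\
&\quad + \Pr(N_2 \le N_1)\, I(W_1, W_2; Y_{N_2+1}^{N_1} | N = N_2, Y^{N_2}),
\end{aligned}$$

with

$$\begin{aligned}
I(W_1, W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1})
&= I(W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}, W_1) + I(W_1; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}) \\
&= I(W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}, W_1) \\
&\quad + H(W_1 | N = N_1, Y^{N_1}) - H(W_1 | N = N_1, Y^{N_2}).
\end{aligned}$$

Since at time $N_1$ the receiver decodes $W_1$, we can apply Fano's
inequality, yielding

$$I(W_1, W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}) \le I(W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}, W_1) + E[N_1]\epsilon,$$

where $\epsilon \to 0$ as $P_e \to 0$. By symmetry, we have

$$I(W_1, W_2; Y_{N_2+1}^{N_1} | N = N_2, Y^{N_2}) \le I(W_1; Y_{N_2+1}^{N_1} | N = N_2, Y^{N_2}, W_2) + E[N_2]\epsilon.$$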



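Hence,

$$\begin{aligned}
I(W_1, W_2; Y_{N+1}^{\max(N_1, N_2)} | N, Y^N)
&\le \Pr(N_1 \le N_2)\, I(W_2; Y_{N_1+1}^{N_2} | N = N_1, Y^{N_1}, W_1) \\
&\quad + \Pr(N_2 \le N_1)\, I(W_1; Y_{N_2+1}^{N_1} | N = N_2, Y^{N_2}, W_2) \\
&\quad + E[N_1]\epsilon + E[N_2]\epsilon \\
&= I(W_2; Y_{N+1}^{N_2} | N, Y^N, W_1) + I(W_1; Y_{N+1}^{N_1} | N, Y^N, W_2) \\
&\quad + E[N_1]\epsilon + E[N_2]\epsilon \\
&\le E[N_2 - N]\, C_2 + E[N_1 - N]\, C_1 \\
&\quad + \log(e E[N_2 - N]) + \log(e E[N_1 - N]) \\
&\quad + E[N_1]\epsilon + E[N_2]\epsilon,
\end{aligned}$$

where we use Lemma 2 to obtain the last inequality.

Putting things together, we get

$$\begin{aligned}
E[N_1](R_1 - \epsilon) &\le E[N]\, I(X_1; Y | X_2, Q) + E[N_1 - N]\, C_1 \\
&\quad + \log(e E[N]) + \log(e E[N_1 - N]), \\
E[N_2](R_2 - \epsilon) &\le E[N]\, I(X_2; Y | X_1, Q) + E[N_2 - N]\, C_2 \\
&\quad + \log(e E[N]) + \log(e E[N_2 - N]), \\
E[N_1](R_1 - \epsilon) + E[N_2](R_2 - \epsilon) &\le E[N]\, I(X_1, X_2; Y | Q) + E[N_1 - N]\, C_1 \\
&\quad + E[N_2 - N]\, C_2 + \log(e E[N]) \\
&\quad + \log(e E[N_1 - N]) + \log(e E[N_2 - N]) \\
&\quad + E[N_1]\epsilon + E[N_2]\epsilon,
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.

Dividing by $E[N_1]$ in the first inequality and by $E[N_2]$
in the second and in the last inequality, then letting
$E[N_1] \to \infty$ and $E[N_2] \to \infty$ with $\frac{E[N_1]}{E[N_2]} = s$ gives the
statement of the theorem. The upper bound on the cardinality
of $\mathcal{Q}$ follows from convex analysis.

□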

In the previous proof we let the expected decoding times be
arbitrarily large, but with regard to our definition of achievability
(Definition 1) it is not clear that this is needed in order to
achieve an arbitrarily low probability of error.^10 For channels
with a zero-error capacity equal to zero, Appendix B gives
a heuristic argument showing that this is indeed required.
However, observe that variable length codes can increase the
zero-error capacity of a channel (for example, one can consider
the binary erasure channel), thus this is not just a technicality.

^10 Here the concatenation argument traditionally made with block codes
does not work.
IV. Comments on the Outer Region

Let $\mathcal{R}_{MAC}$ denote the block code capacity region of a
multiple-access channel,^11 which can be stated as the union of
all pairs $(R_1, R_2)$ satisfying

$$\begin{aligned}
R_1 &\le I(X_1; Y | X_2, Q) \\
R_2 &\le I(X_2; Y | X_1, Q) \\
R_1 + R_2 &\le I(X_1, X_2; Y | Q),
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$,
with $|\mathcal{Q}| \le 2$.

For a given $r_1$ and $r_2$, let us rewrite the region defined by
the outer bound of the previous theorem as the union of all
$(R'_1, R'_2)$ pairs satisfying

$$\begin{aligned}
R'_1 &\le r_2 I(X_1; Y | X_2, Q) + s(1 - r_1) C_1 \\
R'_2 &\le r_2 I(X_2; Y | X_1, Q) + (1 - r_2) C_2 \\
R'_1 + R'_2 &\le r_2 I(X_1, X_2; Y | Q) + s(1 - r_1) C_1 + (1 - r_2) C_2,
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$,
with $|\mathcal{Q}| \le 2$. We have just set $R'_1 = s R_1$ and $R'_2 = R_2$ in the
region of the theorem. Denote it by $\mathcal{R}$. From these expressions,
we have immediately that $(R'_1, R'_2) \in \mathcal{R}$ is equivalent to

$$\frac{1}{r_2}\left(R'_1 - s(1 - r_1) C_1,\; R'_2 - (1 - r_2) C_2\right) \in \mathcal{R}_{MAC}.$$

Therefore, the region given by Theorem 3 can be seen as
a contraction by $(r_1, r_2)$ of the block code capacity region of
a multiple-access channel followed by an extension of
$((1 - r_1) C_1, (1 - r_2) C_2)$. This is illustrated in Fig. 3. One can also
remark that, when $(r_1, r_2) = (1, 1)$, the outer region is equal
to the block code capacity region $\mathcal{R}_{MAC}$, and for $(r_1, r_2) = (0, 0)$
we recover the full rectangle $[0, C_1] \times [0, C_2]$.
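To see how the outer bound of Theorem 3 interpolates between the block code region and the full rectangle, the following sketch evaluates its three inequalities for one fixed input distribution. The numerical values are an assumption chosen for illustration: they correspond to the noiseless binary adder channel $Y = X_1 + X_2$ with uniform inputs, for which $I(X_1;Y|X_2) = I(X_2;Y|X_1) = 1$ bit, $I(X_1,X_2;Y) = 1.5$ bits and $C_1 = C_2 = 1$ bit. The function name is hypothetical.

```python
def outer_bound_holds(R1, R2, r1, r2, I1, I2, I12, C1, C2, tol=1e-12):
    """Check the three inequalities of Theorem 3 for one distribution.

    I1 = I(X1;Y|X2,Q), I2 = I(X2;Y|X1,Q), I12 = I(X1,X2;Y|Q), s = r2/r1.
    """
    s = r2 / r1
    return (R1 <= r1 * I1 + (1 - r1) * C1 + tol
            and R2 <= r2 * I2 + (1 - r2) * C2 + tol
            and s * R1 + R2 <= r2 * I12 + s * (1 - r1) * C1
                               + (1 - r2) * C2 + tol)

ADDER = dict(I1=1.0, I2=1.0, I12=1.5, C1=1.0, C2=1.0)  # assumed example

# r1 = r2 = 1: block code region, sum rate limited to 1.5 bits.
print(outer_bound_holds(1.0, 0.5, 1.0, 1.0, **ADDER))    # True
print(outer_bound_holds(1.0, 1.0, 1.0, 1.0, **ADDER))    # False
# r1 = r2 = 0.5: the sum-rate constraint relaxes to 1.75 bits.
print(outer_bound_holds(1.0, 0.75, 0.5, 0.5, **ADDER))   # True
# Small r1 = r2: the region approaches the rectangle [0,C1] x [0,C2].
print(outer_bound_holds(1.0, 0.99, 0.01, 0.01, **ADDER)) # True
```

The actual outer region is the union over all admissible joint distributions; the sketch only tests a single one.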




Fig. 3. Example of an outer region with an arbitrary $(r_1, r_2)$. The
dashed line with $r_1 = 1$ and $r_2 = 1$ represents the block code
capacity region of a multiple-access channel. The dotted lines show
the construction of the outer region.

Finally, let us emphasize that $\mathcal{C}_{r_1,r_2}$ is defined for variable
length codes with a certain $r_1$ and $r_2$, and that no bounds on
the possible values of these ratios are given here. However,
the existence of a coding scheme with any desired $r_1$ and $r_2$
is not guaranteed. In the next section we specify the outer
region when some restrictions on $E[N_1]$, $E[N_2]$ and $E[N]$ are
imposed and show explicit coding schemes that achieve the
outer region in these particular cases.

^11 For a careful definition and analysis of block codes and multiple-access
channels, the reader is referred to [3] and the references therein.



V. Achievability and Coding Schemes

Let us first restrict the analysis to coding schemes for which
the receiver never (or with a negligible probability) decodes
the message from the first transmitter after the message
coming from the second transmitter, that is $E[N] = E[N_1]$ or,
equivalently, $r_1 = 1$ (so that $s = r_2$). In this case, the outer bound
of Theorem 3 can be written as follows: any rate pair
$(R_1, R_2) \in \mathcal{C}_{1,r_2}$ must satisfy

$$\begin{aligned}
R_1 &\le I(X_1; Y | X_2, Q) \\
R_2 &\le r_2 I(X_2; Y | X_1, Q) + (1 - r_2) C_2 \\
r_2 R_1 + R_2 &\le r_2 I(X_1, X_2; Y | Q) + (1 - r_2) C_2,
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$ with
$|\mathcal{Q}| \le 2$,^12 and where $0 \le r_2 \le 1$.

As the following construction will show, any rate pair in
the region delimited by this outer bound can be achieved by
using a (sequence of) concatenation of two (multiple-access)
block codes. For some $\epsilon > 0$, generate one block code of
length $E[N_1]$ and rates $(R_1^1, R_2^1) \in \mathcal{R}_{MAC}$, and one of length
$E[N_2] - E[N_1]$ and rates $(0, C_2 - \epsilon)$; that is, in the second block
the first transmitter sends the input symbol that allows the second
transmitter to send at its maximum rate (see Fig. 4).



[Figure 4: codewords of the two transmitters drawn along a time axis marked with $E[N_1]$ and $E[N_2]$.]

Fig. 4. Example of codewords formed by the concatenation of two block
codes. The top (resp. bottom) line illustrates the codeword of the first (resp.
second) transmitter. The fill intensity of a block representing a codeword
is proportional to the information rate of the corresponding code.

Denote by $\mathcal{D}^1$ the $(M_1, M_2, N_1^1, N_2^1)$ variable length code
obtained by the concatenation of these two block codes; this
means that we let the codewords be formed by the Cartesian
product of the respective codebooks^13 and that the decoding
functions are equal to the corresponding block code decoding
functions with respect to the fixed stopping times $N_1^1$ and $N_2^1$,
which are given by

$$\begin{aligned}
N_1^1 &= E[N_1^1] = \frac{\log M_1}{R_1^1} \quad \text{a.s.} \\
N_2^1 &= E[N_2^1] = \frac{\log M_1}{R_1^1} + \frac{\log M_2}{C_2 - \epsilon} - \frac{R_2^1}{C_2 - \epsilon}\, \frac{\log M_1}{R_1^1} \quad \text{a.s.};
\end{aligned}$$

this implies that

$$\frac{\log M_1}{E[N_1^1]} = R_1^1, \qquad \frac{\log M_2}{E[N_2^1]} = \frac{N_1^1}{N_2^1} R_2^1 + \left(1 - \frac{N_1^1}{N_2^1}\right)(C_2 - \epsilon).$$

Thus, by letting $N_1^1$ and $N_2^1$ be arbitrarily large with
$\frac{N_1^1}{N_2^1} = r_2$, this coding scheme achieves any rate pair within the
outer region. In the case where $E[N] = E[N_2]$, a symmetric
construction shows that the outer region of Theorem 3 is
achieved. Let us denote by $\mathcal{D}^2$ the $(M_1, M_2, N_1^2, N_2^2)$ variable
length code corresponding to this construction.

^12 Henceforth we will omit to mention the cardinality bound on $Q$.

^13 To be rigorous we should add to each codeword an infinite sequence of
arbitrary input symbols.
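The decoding times and receiver-side rates of the concatenated construction can be computed directly, as in the sketch below. It is a minimal illustration under stated assumptions: block lengths are treated as real numbers (integer rounding is ignored), the function and argument names are hypothetical, and the numerical example again uses the binary adder channel values $(R_1^1, R_2^1) = (1, 0.5)$ and $C_2 = 1$ bit.

```python
def concatenated_code(log_m1, log_m2, r11, r21, c2, eps):
    """Decoding times and rates for two concatenated block codes.

    First block: rates (r11, r21) inside the block code region.
    Second block: rates (0, c2 - eps), used only by the second user.
    """
    n1 = log_m1 / r11                       # user 1 is decoded here
    tail_bits = log_m2 - n1 * r21           # what user 2 still has to send
    if tail_bits < 0:
        raise ValueError("user 2 finishes before user 1 in this setting")
    n2 = n1 + tail_bits / (c2 - eps)        # user 2 is decoded here
    return n1, n2, log_m1 / n1, log_m2 / n2

n1, n2, R1, R2 = concatenated_code(1000, 1000, 1.0, 0.5, 1.0, 0.01)
r2 = n1 / n2
print(n1, n2, R1, R2)
# R2 is the convex combination r2 * R_2^1 + (1 - r2) * (C2 - eps).
print(abs(R2 - (r2 * 0.5 + (1 - r2) * 0.99)) < 1e-12)   # True
```

The second user's rate lies between its rate in the first block and $C_2 - \epsilon$, with the weight set by the ratio of the two decoding times.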

This shows that, in the particular cases where $E[N] = E[N_1]$
or $E[N] = E[N_2]$, the best coding scheme is composed
of two successive block codes. Hence, in this example, we see
that the gain in terms of achievable rates essentially comes
from the possibility for the receiver to decode each message
at a different instant of time.

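Concerning the general case with no specific restriction on
$E[N]$, for fixed values of $E[N_1]$ and $E[N_2]$, the best outer
bound is obtained by minimizing $E[N]$. Since $N_1 \ge \frac{\log M_1}{C_1}$
and $N_2 \ge \frac{\log M_2}{C_2}$ with high probability, we have
$E[N] \ge \min\left(\frac{\log M_1}{C_1}, \frac{\log M_2}{C_2}\right)$.
In the remainder of this section, we restrict our analysis to coding
schemes with $\frac{\log M_1}{C_1} = \frac{\log M_2}{C_2}$;
this imposes a restriction on the ratio of the expected decoding
times. For such codes, using the lower bound on $E[N]$, we
have $1 \ge r_1 \ge \frac{R_1}{C_1}$ and $1 \ge r_2 \ge \frac{R_2}{C_2}$; thus we may rewrite the
outer bound on $\mathcal{C}_{r_1,r_2}$, for $r_1 = \frac{R_1}{C_1}$ and $r_2 = \frac{R_2}{C_2}$, as

$$\begin{aligned}
R_1 &\le \frac{C_1}{2 - \frac{I(X_1; Y | X_2, Q)}{C_1}} \\
R_2 &\le \frac{C_2}{2 - \frac{I(X_2; Y | X_1, Q)}{C_2}} \\
s R_1 + R_2 &\le \frac{R_2}{C_2} I(X_1, X_2; Y | Q) + \left(1 - \frac{R_1}{C_1}\right) s C_1 + \left(1 - \frac{R_2}{C_2}\right) C_2,
\end{aligned}$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$, and
for $s = \frac{C_1 R_2}{C_2 R_1}$. The last inequality can be worked out to bound
$R_2$ by a function of $R_1$:

$$R_2 \le \frac{C_2}{2 + 2\frac{C_1}{C_2} - \frac{I(X_1, X_2; Y | Q)}{C_2} - \frac{C_1^2}{C_2 R_1}}.$$

Thus, when $R_1$ satisfies its upper bound with equality, $R_2$
must satisfy

$$R_2 \le \frac{C_2}{2 - \frac{I(X_2; Y | Q)}{C_2}},$$

for some joint distribution $p(q)p(x_1|q)p(x_2|q)p(y|x_1, x_2)$.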
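The algebra that turns the third inequality into a bound on $R_2$ as a function of $R_1$ can be checked numerically. The sketch below is an illustration with hypothetical function names; it plugs the closed-form bound back into the third inequality with $s = \frac{C_1 R_2}{C_2 R_1}$, $r_1 = \frac{R_1}{C_1}$, $r_2 = \frac{R_2}{C_2}$ and verifies that the gap vanishes. The numerical values (binary adder channel, $C_1 = C_2 = 1$, $I(X_1,X_2;Y|Q) = 1.5$ bits) are assumptions for the example.

```python
def r2_upper(R1, I12, C1, C2):
    """Closed-form bound on R2 obtained from the third inequality."""
    denom = 2 + 2 * C1 / C2 - I12 / C2 - C1 ** 2 / (C2 * R1)
    return C2 / denom

def third_inequality_gap(R1, R2, I12, C1, C2):
    """Right-hand side minus left-hand side of the third inequality."""
    s = C1 * R2 / (C2 * R1)
    r1, r2 = R1 / C1, R2 / C2
    rhs = r2 * I12 + (1 - r1) * s * C1 + (1 - r2) * C2
    return rhs - (s * R1 + R2)

R1 = 0.8
R2 = r2_upper(R1, 1.5, 1.0, 1.0)
print(R2)                                          # close to 0.8
print(third_inequality_gap(R1, R2, 1.5, 1.0, 1.0)) # close to 0
```

In this example the symmetric pair $(0.8, 0.8)$ meets the bound with equality, which is above the symmetric block code point $(0.75, 0.75)$ of the same channel.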

We now specify this outer region when the block code
capacity region of the multiple-access channel forms a pentagon.
Let us denote by $(C_1, d_2)$ and $(d_1, C_2)$ the corner
points of the dominant face of $\mathcal{R}_{MAC}$ (see Fig. 5). Then, let
the joint distribution be such that the pair
$(I(X_1; Y | X_2, Q), I(X_2; Y | Q))$ is on the dominant face of $\mathcal{R}_{MAC}$; we can
describe any such pair by $I(X_1; Y | X_2, Q) = d_1 + \rho(C_1 - d_1)$
and $I(X_2; Y | Q) = d_2 + \bar{\rho}(C_2 - d_2)$, for some $\rho \in [0, 1]$,
with $\bar{\rho} = 1 - \rho$. Therefore, in this setting, the achievable rates satisfy

$$\begin{aligned}
R_1 &\le \frac{C_1}{1 + \bar{\rho}\left(1 - \frac{d_1}{C_1}\right)} \\
R_2 &\le \frac{C_2}{1 + \rho\left(1 - \frac{d_2}{C_2}\right)}
\end{aligned} \qquad (1)$$

for some $\rho \in [0, 1]$. Observe that the region of all rate pairs
satisfying (1) is not convex.




Fig. 5. Example of a region of achievable rates for a multiple-access
channel. The figure shows the region $\mathcal{C}_{r_1,r_2}$ for a multiple-access channel
with a pentagon-shaped capacity region and for variable length codes for which
$\frac{\log M_1}{C_1} = \frac{\log M_2}{C_2}$. The dashed line delimits the achievable region using block
codes.



In order to achieve this bound, we consider variable length
codes with non-deterministic encoders.^14 The idea is to use the
codes $\mathcal{D}^1$ and $\mathcal{D}^2$ in alternation. To communicate a message
pair $(w_1, w_2) \in \{1, \ldots, M_1\} \times \{1, \ldots, M_2\}$, with probability $\lambda$, the transmitters
use the codeword pair in $\mathcal{D}^1$ corresponding to $(w_1, w_2)$, and
with probability $1 - \lambda = \bar{\lambda}$ they use the corresponding
codeword pair in $\mathcal{D}^2$. The codewords obtained by this procedure
form the codebook, which is revealed to the receiver (and the
transmitters). This is a kind of "time-sharing" between the
codes $\mathcal{D}^1$ and $\mathcal{D}^2$, except that here the two codebooks have
a different timeliness, and thus we cannot construct a new
codebook with the desired rates by simply using one codebook
a fraction of time and the other the remaining fraction of time.
The decoding times $N_1$ and $N_2$ of this coding scheme satisfy

$$\begin{aligned}
E[N_1] &= \lambda N_1^1 + \bar{\lambda} N_1^2 \\
E[N_2] &= \lambda N_2^1 + \bar{\lambda} N_2^2.
\end{aligned}$$

Now, for some $\epsilon > 0$, set $(R_1^1, R_2^1) = (C_1 - \epsilon, d_2)$ in the first
block code of $\mathcal{D}^1$ and $(R_1^2, R_2^2) = (d_1, C_2 - \epsilon)$ in the first
block code of $\mathcal{D}^2$. For $E[N_1]$ and $E[N_2]$ arbitrarily large, this
random coding scheme achieves the following rates:

$$\begin{aligned}
R_1 &= \frac{C_1 - \epsilon}{1 + \bar{\lambda}\, \frac{\log M_2}{\log M_1}\, \frac{C_1 - \epsilon}{C_2 - \epsilon} \left(1 - \frac{d_1}{C_1 - \epsilon}\right)} \\
R_2 &= \frac{C_2 - \epsilon}{1 + \lambda\, \frac{\log M_1}{\log M_2}\, \frac{C_2 - \epsilon}{C_1 - \epsilon} \left(1 - \frac{d_2}{C_2 - \epsilon}\right)},
\end{aligned}$$

for all $\lambda \in [0, 1]$.

This can be related to the outer bound given by (1); in
particular, for $\frac{\log M_1}{C_1 - \epsilon} = \frac{\log M_2}{C_2 - \epsilon}$, we have that any rate pair
$(R_1, R_2)$ such that

$$\begin{aligned}
R_1 &\le \frac{C_1 - \epsilon}{1 + \bar{\lambda}\left(1 - \frac{d_1}{C_1 - \epsilon}\right)} \\
R_2 &\le \frac{C_2 - \epsilon}{1 + \lambda\left(1 - \frac{d_2}{C_2 - \epsilon}\right)},
\end{aligned}$$

for some $\lambda \in [0, 1]$, is achievable. Thus, any rate pair within
the outer region is achieved in this special case, showing that
"time-sharing" coding strategies are sufficient for this setting.
The shape of such a region is represented in Fig. 5. Note
that one can achieve higher rates with variable length coding
than the rates achievable with fixed length coding even when
$E[N_1] = E[N_2]$. This holds only because of the possibility
for the transmitters to send a part of their message in
non-overlapping periods of time.

^14 Note that our setting can be extended to incorporate non-deterministic
encoders and the outer bound on $\mathcal{C}_{r_1,r_2}$ still holds.
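The rates of the randomized "time-sharing" scheme follow from the expected decoding times of the two concatenated codes, which the sketch below computes and compares with the closed-form expression. It is an illustration under stated assumptions: block lengths are real-valued, the function name is hypothetical, and the example uses a pentagon with $C_1 = C_2 = 1$, $d_1 = d_2 = 0.5$ (the binary adder channel values) and $\frac{\log M_1}{C_1 - \epsilon} = \frac{\log M_2}{C_2 - \epsilon}$.

```python
def time_sharing_rates(lam, log_m1, log_m2, c1, c2, d1, d2, eps):
    """Receiver-side rates when D^1 is used w.p. lam and D^2 otherwise."""
    # Code D^1: first block at rates (c1 - eps, d2), user 1 decoded first.
    n11 = log_m1 / (c1 - eps)
    n21 = n11 + (log_m2 - n11 * d2) / (c2 - eps)
    # Code D^2: first block at rates (d1, c2 - eps), user 2 decoded first.
    n22 = log_m2 / (c2 - eps)
    n12 = n22 + (log_m1 - n22 * d1) / (c1 - eps)
    e_n1 = lam * n11 + (1 - lam) * n12
    e_n2 = lam * n21 + (1 - lam) * n22
    return log_m1 / e_n1, log_m2 / e_n2

lam, eps = 0.5, 0.01
R1, R2 = time_sharing_rates(lam, 990, 990, 1.0, 1.0, 0.5, 0.5, eps)
closed_form = (1 - eps) / (1 + (1 - lam) * (1 - 0.5 / (1 - eps)))
print(R1, R2, closed_form)
print(R1 + R2 > 1.5)   # True: above the block code sum rate of 1.5 bits
```

With $\lambda = 1/2$ the two expected decoding times coincide, yet the sum rate still exceeds the block code sum rate, in line with the remark above.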

Finally we remark that these coding strategies need to fix 
the transmission rates (through the decoding times) before 
generating the codebook, thus each transmitter is aware of the 
rate used by the other transmitters. In the next section we show 
the existence of variable length codes achieving the direct 
part of the block code capacity region of a multiple-access 
channel without requiring a common agreement between the 
transmitters (decentralized setting). 

VI. Random Variable Length Codes 

In this section we analyze the rates achievable when the
transmitters employ a random codebook, that is, the sequences
of mappings $\{x_{1i}(W_1)\}_{i \ge 1}$ (resp. $\{x_{2i}(W_2)\}_{i \ge 1}$) are $M_1$
(resp. $M_2$) random sequences of i.i.d. samples distributed
according to a probability distribution $p(x_1)$ (resp. $p(x_2)$)
defined over $\mathcal{X}_1$ (resp. $\mathcal{X}_2$).

A. A joint decoding rule 

Let each transmitter start the transmission of a uniformly
chosen codeword in the random codebook. At time $n$, the
decoder bases its decision on the sequence of received values
$y^n$. If we constrain the decoding times $N_1$ and $N_2$ to be
equal, the joint decoder that minimizes the probability of
error will use a MAP (maximum a posteriori) rule and choose
the message indices $(w_1, w_2)$ maximizing the probability that
$(w_1, w_2)$ is transmitted knowing the received sequence. Let^15

$$T(n) = \max_{w_1, w_2} \Pr((w_1, w_2) \text{ is transmitted} \,|\, y^n);$$

then the optimal joint decoder (the one that minimizes the
expected decoding time subject to a probability of error
constraint) will make a decision at the time instant $n$ for
which $T(n)$ exceeds a pre-determined threshold, and decode
the messages $(w_1, w_2)$ achieving the maximum in the MAP
rule.

Since the optimal rule is difficult to analyze, here we will 
make the hypothesis that $p(y^n) = \prod_{i=1}^{n} p(y_i)$, and look at the 
following modified version of the optimal decoding rule 

$$T(n) = \max_{w_1,w_2} \frac{p(y^n \mid x_1^n(w_1), x_2^n(w_2))}{p(y^n)} = \max_{w_1,w_2} \prod_{i=1}^{n} \frac{p(y_i \mid x_{1i}(w_1), x_{2i}(w_2))}{p(y_i)},$$

taking the logarithm, we obtain 

$$S_{joint}(n) = \max_{w_1,w_2} \sum_{i=1}^{n} \log \frac{p(y_i \mid x_{1i}(w_1), x_{2i}(w_2))}{p(y_i)}.$$

Let us denote the expression under the summation by 

$$Z_i(w_1,w_2) = \log \frac{p(y_i \mid x_{1i}(w_1), x_{2i}(w_2))}{p(y_i)}$$

and the summation by $S(n,w_1,w_2) = \sum_{i=1}^{n} Z_i(w_1,w_2)$. 
Note that for a fixed pair $(w_1,w_2)$, $\{Z_i(w_1,w_2)\}_{i \geq 1}$ is a 
sequence of i.i.d. random variables, and $\{S(n,w_1,w_2)\}_{n \geq 1}$ 
is a random walk. Therefore, the joint decoder will declare 
the message pair $(w_1,w_2)$ corresponding to the first (among 
$M_1M_2$) random walk that crosses a given threshold (see Fig. 6). 

Footnote: The fact that the decoder knows the realization of the random codewords 
is implicit in the definition of $T(n)$. 

Footnote: This assumption holds, since the channel is memoryless and we use 
a random codebook with i.i.d. samples, but here the decoder knows the 
realization of the random codewords and so it can compute all the conditional 
probabilities such as $p(y^n \mid x_1^n(1), \ldots, x_1^n(M_1), x_2^n(1), \ldots, x_2^n(M_2))$, to 
obtain the true MAP rule. 

Let us consider the following threshold $(1+\epsilon)\log(M_1M_2)$ 
with $\epsilon > 0$, then $N$ is the stopping time defined by 

$$N = \min\{n \geq 1 : S_{joint}(n) \geq (1+\epsilon)\log(M_1M_2)\}.$$

Fig. 6. Illustration of joint decoding with $M_1M_2 = 4$. Each trace 
represents a random walk $\{S(n,w_1,w_2)\}_{n \geq 1}$ corresponding to a message 
pair $(w_1,w_2)$. As soon as a random walk crosses the threshold given by 
$(1+\epsilon)\log(M_1M_2)$, the decoder declares the corresponding message pair. 

Assume, without loss of generality, that the message pair 
(1,1) is transmitted, and let us denote by $N_{1,1}$ the crossing 
time of the random walk corresponding to the message pair 
(1,1); note that $N \leq N_{1,1}$. Then, we have 

$$E[Z_i(1,1)] = I(X_1,X_2;Y),$$

using Wald's equality (see, e.g., [7]), we get 

$$E[S(N_{1,1},1,1)] = I(X_1,X_2;Y)\,E[N_{1,1}].$$

For $M_1M_2$ large we can ignore the overshoots and 
$E[S(N_{1,1},1,1)] \approx (1+\epsilon)\log(M_1M_2)$. Thus, we can conclude 
that 

$$E[N] \leq E[N_{1,1}] \approx \frac{(1+\epsilon)\log(M_1M_2)}{I(X_1,X_2;Y)},$$

which implies that 

$$R_1 + R_2 \geq \frac{I(X_1,X_2;Y)}{1+\epsilon}.$$
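The stopping rule and the Wald estimate above can be tried numerically. The following is a minimal sketch, not the authors' experiment: it assumes a hypothetical noisy binary adder channel (parameter `NOISE`), uniform binary inputs, and illustrative names (`channel`, `joint_decode`, `simulate`) that do not appear in the paper. Codeword symbols are drawn lazily at each time step, which has the same distribution as an i.i.d. random codebook, and message pair (0, 0) plays the role of the transmitted pair (1,1).

```python
import math
import random

NOISE = 0.1  # hypothetical channel parameter, chosen only for illustration


def channel(x1, x2):
    """p(y | x1, x2) for a noisy binary adder MAC with outputs y in {0, 1, 2}."""
    return [NOISE / 3 + (1 - NOISE) * (y == x1 + x2) for y in range(3)]


PAIRS = [(a, b) for a in (0, 1) for b in (0, 1)]
# Output marginal p(y) under independent uniform inputs.
P_Y = [sum(0.25 * channel(a, b)[y] for a, b in PAIRS) for y in range(3)]
# log p(y | x1, x2) / p(y): one step Z_i of a random walk.
LLR = {(a, b): [math.log(channel(a, b)[y] / P_Y[y]) for y in range(3)]
       for a, b in PAIRS}
# I(X1, X2; Y) in nats: the drift of the walk of the transmitted pair.
I_JOINT = sum(0.25 * channel(a, b)[y] * LLR[a, b][y]
              for a, b in PAIRS for y in range(3))


def joint_decode(m1, m2, eps, rng, max_n=100000):
    """Run the m1*m2 walks until one reaches (1 + eps) * log(m1 * m2).

    Returns (stopping time, declared pair); the transmitted pair is (0, 0).
    """
    threshold = (1 + eps) * math.log(m1 * m2)
    walks = [[0.0] * m2 for _ in range(m1)]
    for n in range(1, max_n + 1):
        # n-th symbol of every codeword, drawn i.i.d. uniform.
        c1 = [rng.randrange(2) for _ in range(m1)]
        c2 = [rng.randrange(2) for _ in range(m2)]
        y = rng.choices(range(3), weights=channel(c1[0], c2[0]))[0]
        best = None
        for w1 in range(m1):
            for w2 in range(m2):
                walks[w1][w2] += LLR[c1[w1], c2[w2]][y]
                s = walks[w1][w2]
                if s >= threshold and (best is None or s > walks[best[0]][best[1]]):
                    best = (w1, w2)
        if best is not None:
            return n, best
    return max_n, None


def simulate(m1=8, m2=8, eps=1.0, trials=200, seed=1):
    """Empirical mean stopping time, Wald estimate, and empirical error rate."""
    rng = random.Random(seed)
    results = [joint_decode(m1, m2, eps, rng) for _ in range(trials)]
    mean_n = sum(n for n, _ in results) / trials
    error_rate = sum(decision != (0, 0) for _, decision in results) / trials
    wald_estimate = (1 + eps) * math.log(m1 * m2) / I_JOINT
    return mean_n, wald_estimate, error_rate


if __name__ == "__main__":
    print(simulate())
```

The empirical mean of $N$ is expected to sit near the Wald estimate, with a small offset caused by the overshoot that the analysis ignores for large $M_1M_2$.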



The joint decoder makes an error when a random walk 
corresponding to a different message pair crosses the threshold 
before $\{S(n,1,1)\}$. The wrong messages come in three kinds: 

1) $(w_1,w_2)$ such that $w_1 \neq 1$ and $w_2 \neq 1$, 

2) $(w_1,w_2)$ such that $w_1 = 1$ and $w_2 \neq 1$, 

3) $(w_1,w_2)$ such that $w_1 \neq 1$ and $w_2 = 1$. 

Footnote: Here we have $N_1 = N_2 = N$. 

In each case we have 

$$E[Z_i(w_1 \neq 1, w_2 \neq 1)] = \sum_{x_1,x_2,y} p(x_1)p(x_2)p(y) \log \frac{p(y \mid x_1,x_2)}{p(y)} = -D(p(y) \,\|\, p(y \mid x_1,x_2)) < 0,$$

$$E[Z_i(w_1 = 1, w_2 \neq 1)] = \sum_{x_1,x_2,y} p(x_1)p(x_2)p(y \mid x_1) \log \frac{p(y \mid x_1,x_2)}{p(y)} \leq \sum_{x_1,x_2,y} p(x_1)p(x_2)p(y \mid x_1) \left( \frac{p(y \mid x_1,x_2)}{p(y)} - 1 \right) = 0,$$

$$E[Z_i(w_1 \neq 1, w_2 = 1)] \leq 0,$$

where we use the fact that $\log x \leq x - 1$ and the last 
inequality follows by symmetry. Note that the expectations 
are taken with respect to the joint probability of $(X_1,X_2,Y)$ 
corresponding to the message pair $(w_1,w_2)$ considered. 

Thus $\{\{S(n,w_1,w_2)\}_{n \geq 1} : (w_1,w_2) \neq (1,1)\}$ are random 
walks with negative drift. For those random walks one can 
show (see, e.g., [?]) that the probability of ever crossing a 
threshold $T$ is upper bounded as follows 

$$\Pr(\text{crossing } T) \leq e^{-\lambda^*(w_1,w_2)T},$$

where $\lambda^*(w_1,w_2)$ corresponds to the unique positive root of 
the log moment generating function of $Z_i(w_1,w_2)$, i.e., 

$$\log E\left[e^{\lambda^*(w_1,w_2) Z_i(w_1,w_2)}\right] = 0.$$



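Therefore, we can upper bound the probability of error by the 
probability that any random walk in $\{\{S(n,w_1,w_2)\}_{n \geq 1} : 
(w_1,w_2) \neq (1,1)\}$ crosses the threshold $T = (1+\epsilon)\log(M_1M_2)$: 

$$P_e \leq M_1 e^{-\lambda^*(w_1 \neq 1, w_2 = 1)T} + M_2 e^{-\lambda^*(w_1 = 1, w_2 \neq 1)T} + M_1M_2 e^{-\lambda^*(w_1 \neq 1, w_2 \neq 1)T}.$$

Here we have 

$$E\left[e^{Z_i(w_1 \neq 1, w_2 \neq 1)}\right] = \sum_{x_1,x_2,y} p(x_1)p(x_2)p(y) \frac{p(y \mid x_1,x_2)}{p(y)} = 1,$$

which implies that $\lambda^*(w_1 \neq 1, w_2 \neq 1) = 1$. If, in addition, 

$$\lambda^*(w_1 = 1, w_2 \neq 1) \geq \frac{I(X_2;Y \mid X_1)}{I(X_1,X_2;Y)}, \qquad \lambda^*(w_1 \neq 1, w_2 = 1) \geq \frac{I(X_1;Y \mid X_2)}{I(X_1,X_2;Y)},$$

we would have had 

$$P_e \leq e^{E[N](R_1 - I(X_1;Y \mid X_2))} + e^{E[N](R_2 - I(X_2;Y \mid X_1))} + (M_1M_2)^{-\epsilon},$$

and thus, by letting $\epsilon \to 0$ and $E[N] \to \infty$, any rate pair in 
$\mathcal{R}_{MAC}$ with a fixed input distribution $p(x_1)p(x_2)$ would have 
been achievable using this joint decoding rule. Unfortunately, 
in general, $\lambda^*(w_1 = 1, w_2 \neq 1)$ and $\lambda^*(w_1 \neq 1, w_2 = 1)$ 
do not satisfy the preceding inequalities. However, we can 
improve this joint decoding scheme by combining it with other 
schemes as explained in the following subsection. 

Footnote: Here "log" denotes the logarithm to the base $e$. 

Footnote: Note that the inequalities $E[Z_i(w_1 = 1, w_2 \neq 1)] \leq 0$ and 
$E[Z_i(w_1 \neq 1, w_2 = 1)] \leq 0$ are equivalent to $D(p(y \mid x_1) \,\|\, p(y)) - 
D(p(y \mid x_1) \,\|\, p(y \mid x_1,x_2)) \leq 0$ and $D(p(y \mid x_2) \,\|\, p(y)) - 
D(p(y \mid x_2) \,\|\, p(y \mid x_1,x_2)) \leq 0$, with $p(y \mid x_1) = \sum_{x_2'} p(x_2')p(y \mid x_1,x_2')$ and 
$p(y \mid x_2) = \sum_{x_1'} p(x_1')p(y \mid x_1',x_2)$. 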
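The two facts used above, the root $\lambda^* = 1$ when both messages are wrong and the possible failure of the inequality on $\lambda^*(w_1 = 1, w_2 \neq 1)$, can be checked exactly on a small example. The sketch below assumes a hypothetical noisy binary adder channel with uniform inputs; the channel, the parameter `NOISE` and all function names are illustrative choices, not taken from the paper. For this particular channel the computed root falls below $I(X_2;Y|X_1)/I(X_1,X_2;Y)$.

```python
import math

NOISE = 0.1  # hypothetical parameter of the illustrative channel
X = (0, 1)
Y = (0, 1, 2)


def W(y, x1, x2):
    """p(y | x1, x2) for a noisy binary adder MAC."""
    return NOISE / 3 + (1 - NOISE) * (y == x1 + x2)


def p_y(y):
    """Output marginal under independent uniform inputs."""
    return sum(0.25 * W(y, a, b) for a in X for b in X)


def p_y_given_x1(y, x1):
    """p(y | x1), averaging over the uniform second input."""
    return sum(0.5 * W(y, x1, b) for b in X)


def mgf_both_wrong(lam):
    # Tested symbols (a, b) are independent of the output y ~ p(y).
    return sum(0.25 * p_y(y) * (W(y, a, b) / p_y(y)) ** lam
               for a in X for b in X for y in Y)


def mgf_w2_wrong(lam):
    # a is the true first symbol, b a tested wrong second symbol, y ~ p(y | a).
    return sum(0.25 * p_y_given_x1(y, a) * (W(y, a, b) / p_y(y)) ** lam
               for a in X for b in X for y in Y)


def positive_root(mgf, lo=1e-3, hi=1.0, iters=200):
    """Bisection for the positive root of log mgf, assuming mgf(lo) < 1."""
    while mgf(hi) < 1.0:
        hi *= 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mgf(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mutual_informations():
    """Return I(X1,X2;Y) and I(X2;Y|X1) in nats."""
    i_joint = sum(0.25 * W(y, a, b) * math.log(W(y, a, b) / p_y(y))
                  for a in X for b in X for y in Y)
    i_2_given_1 = sum(0.25 * W(y, a, b) * math.log(W(y, a, b) / p_y_given_x1(y, a))
                      for a in X for b in X for y in Y)
    return i_joint, i_2_given_1


if __name__ == "__main__":
    i_joint, i_2_given_1 = mutual_informations()
    print("E[exp(Z)] for both messages wrong:", mgf_both_wrong(1.0))
    print("root for (w1 = 1, w2 != 1):", positive_root(mgf_w2_wrong))
    print("I(X2;Y|X1) / I(X1,X2;Y):", i_2_given_1 / i_joint)
```

Bisection is used here only because the log moment generating function is convex and vanishes at zero, so a sign change brackets the positive root.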

B. A combined decoding rule 

We will combine the joint decoding rule with the following 
decoding rules. Suppose that the receiver knows which mes- 
sage the second transmitter is sending, then the equivalent of 
the previous rule to decode the message coming from the first 
transmitter is 

$$S_{w_2}(n) = \max_{w_1} \sum_{i=1}^{n} \log \frac{p(y_i \mid x_{1i}(w_1), x_{2i}(w_2))}{p(y_i \mid x_{2i}(w_2))},$$

where $p(y_i \mid x_{2i}(w_2)) = \sum_{x_1'} p(x_1')p(y_i \mid x_1', x_{2i}(w_2))$. Denote 
the expression under the summation by 

$$Z_i(w_1 \mid w_2) = \log \frac{p(y_i \mid x_{1i}(w_1), x_{2i}(w_2))}{p(y_i \mid x_{2i}(w_2))}$$

and the summation by $S(n, w_1 \mid w_2) = \sum_{i=1}^{n} Z_i(w_1 \mid w_2)$, thus 
$\{S(n, w_1 \mid w_2)\}_{n \geq 1}$ are $M_1$ random walks and the receiver will 
declare the message corresponding to the first random walk 
crossing the pre-determined threshold. Here, we let $N_{1,w_2}$ be 
the stopping time defined by 

$$N_{1,w_2} = \min\{n \geq 1 : S_{w_2}(n) \geq (1+\epsilon)\log M_1\}.$$

Assuming that the message pair (1,1) is transmitted, we have 

$$E[Z_i(1 \mid 1)] = I(X_1;Y \mid X_2),$$

and 

$$E[Z_i(w_1 \neq 1 \mid w_2 = 1)] = \sum_{x_1,x_2,y} p(x_1)p(x_2)p(y \mid x_2) \log \frac{p(y \mid x_1,x_2)}{p(y \mid x_2)} = -D(p(y \mid x_2) \,\|\, p(y \mid x_1,x_2)) < 0.$$

As before we can upper bound the probability of error knowing 
that the message $w_2 = 1$ is transmitted by the probability that a 
random walk corresponding to a different message $w_1$ crosses 
the threshold $T_{w_2} = (1+\epsilon)\log M_1$, which gives 

$$P_e \leq M_1 e^{-\lambda^*(w_1 \neq 1 \mid w_2 = 1)T_{w_2}},$$

where $\lambda^*(w_1 \neq 1 \mid w_2 = 1)$ is the unique positive root of 
the log moment generating function of $Z_i(w_1 \neq 1 \mid w_2 = 1)$, 
which turns out to be equal to 1. This allows us to conclude 
that 

$$E[N_{1,w_2=1}] \approx \frac{(1+\epsilon)\log M_1}{I(X_1;Y \mid X_2)}$$

with 

$$P_e \leq M_1^{-\epsilon}.$$

The same results hold if the receiver knows $w_1$ and wants to 
decode $w_2$ (with an interchange of the indices 1 and 2 in the 
above equations). 
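The statement that this root equals 1 can be verified in two lines. The following worked step is a sketch that uses only the definition of $Z_i(w_1 \mid w_2)$ and the product input distribution $p(x_1)p(x_2)$; terms with $p(y \mid x_2) = 0$ are understood to contribute zero.

```latex
\begin{align*}
E\left[e^{Z_i(w_1 \neq 1 \mid w_2 = 1)}\right]
  &= \sum_{x_1,x_2,y} p(x_1)\,p(x_2)\,p(y \mid x_2)\,
     \frac{p(y \mid x_1,x_2)}{p(y \mid x_2)} \\
  &= \sum_{x_1,x_2,y} p(x_1)\,p(x_2)\,p(y \mid x_1,x_2) = 1,
\end{align*}
% so the log moment generating function vanishes at \lambda = 1, and
\begin{equation*}
M_1\, e^{-(1+\epsilon)\log M_1} = M_1 \cdot M_1^{-(1+\epsilon)} = M_1^{-\epsilon}.
\end{equation*}
```

The same computation with the indices 1 and 2 interchanged gives the bound $M_2^{-\epsilon}$ for the decoder that knows $w_1$.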

Now, let us remove the assumption that one of the transmitted 
messages is known by the receiver and combine these 
decoding schemes as follows. Consider a receiver which runs the 
three preceding decoding rules in parallel and declares the first 
message pair $(w_1,w_2)$ for which the corresponding random 
walks have crossed the threshold in each decoding scheme. Such 
a decoder will run all the random walks $\{S(n,w_1,w_2)\}_{n \geq 1}$, 
$\{S(n,w_1 \mid w_2)\}_{n \geq 1}$ and $\{S(n,w_2 \mid w_1)\}_{n \geq 1}$, and stop when the 
random walks corresponding to one message pair have hit the 
pre-determined threshold in each scheme, that is, the decoding 
time of this combined scheme is given by 

$$N_{comb} = \min\{n \geq 1 : \exists (w_1,w_2) \text{ and } n_1,n_2,n_3 \leq n \text{ such that } 
S(n_1,w_1,w_2) \geq (1+\epsilon)\log(M_1M_2),\ 
S(n_2,w_1 \mid w_2) \geq (1+\epsilon)\log M_1,\ 
S(n_3,w_2 \mid w_1) \geq (1+\epsilon)\log M_2\}.$$

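This combined decoder will make an error when the random 
walks corresponding to a wrong message pair cross the 
given thresholds before the correct ones in each scheme. In 
view of what has been said before, the probability of error 
of this combined decoder can be bounded as follows 

$$P_e \leq M_1M_2 e^{-\lambda^*(w_1 \neq 1, w_2 \neq 1)(1+\epsilon)\log(M_1M_2)} + M_1 e^{-\lambda^*(w_1 \neq 1 \mid w_2 = 1)(1+\epsilon)\log M_1} + M_2 e^{-\lambda^*(w_2 \neq 1 \mid w_1 = 1)(1+\epsilon)\log M_2},$$

assuming that the message pair (1,1) is transmitted. Thus, we 
obtain that 

$$P_e \leq (M_1M_2)^{-\epsilon} + M_1^{-\epsilon} + M_2^{-\epsilon},$$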

and the probability of error goes to zero, as $M_1$ and $M_2$ get 
large. If we denote by $N_{1,1}$, $N_{1,w_2=1}$ and $N_{2,w_1=1}$ the crossing 
times of the random walks corresponding to the message (1,1) 
in each of the three preceding schemes, we can see from the 
expression of $N_{comb}$ that 

$$E[N_{comb}] \leq E[\max(N_{1,1}, N_{1,w_2=1}, N_{2,w_1=1})]. \qquad (2)$$

At this point, let us remark that since the random walks 
$S(n_1,1,1)$, $S(n_2,1 \mid 1)$ and $S(n_3,1 \mid 1)$ concentrate around their 
mean as $n$ becomes large, the respective decoding times also 
concentrate around their mean as the thresholds get large; this 
is shown in Appendix C. 

Using this we see that as the crossing thresholds get 
large, each of the three preceding decoding times concentrates 
around its mean and thus the expectation in (2) becomes 
approximately equal to the maximum of the three expected 
decoding times, hence for $M_1$ and $M_2$ sufficiently large, we 
have 

$$E[N_{comb}] \lesssim \max(E[N_{1,1}], E[N_{1,w_2=1}], E[N_{2,w_1=1}]). \qquad (3)$$

Therefore, for $M_1$ and $M_2$ large and for $\epsilon \to 0$, this random 
code approaches one of the following rate pairs (in the same 
order as for the max in (3)), depending on which expected 
decoding time is greater: 

$$\left( \frac{\log M_1}{\log M_1 + \log M_2} I(X_1,X_2;Y),\ \frac{\log M_2}{\log M_1 + \log M_2} I(X_1,X_2;Y) \right),$$

$$\left( I(X_1;Y \mid X_2),\ \frac{\log M_2}{\log M_1} I(X_1;Y \mid X_2) \right),$$

$$\left( \frac{\log M_1}{\log M_2} I(X_2;Y \mid X_1),\ I(X_2;Y \mid X_1) \right).$$

Note that, according to the values of the ratio $\log M_1 / \log M_2$, any 
rate pair in $\mathcal{R}_{MAC}$ with a fixed input distribution $p(x_1)p(x_2)$ 
is achieved. For example, if $M_1 = M_2$ this coding scheme 
achieves the rate pair $\left( \frac{I(X_1,X_2;Y)}{2} - \delta,\ \frac{I(X_1,X_2;Y)}{2} - \delta \right)$, and if 
$\frac{\log M_1}{\log M_2} = \frac{I(X_1;Y \mid X_2)}{I(X_2;Y)}$ it achieves the rate pair $(I(X_1;Y \mid X_2) - 
\delta,\ I(X_2;Y) - \delta)$, for some $\delta > 0$. Hence we have shown 
the existence of a variable length code achieving a certain 
rate pair in $\mathcal{R}_{MAC}$, without a previous agreement between 
the transmitters. 
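The selection among the three limiting rate pairs can be written as a few lines of arithmetic. This is a minimal sketch in the limit $\epsilon \to 0$, under the heuristic that the expected decoding time equals the largest of the three approximate crossing times; the function name `combined_rate_pair` and the numerical mutual information values in the usage lines are hypothetical, chosen only for illustration.

```python
def combined_rate_pair(log_m1, log_m2, i_joint, i_1_given_2, i_2_given_1):
    """Limiting rate pair (eps -> 0) of the combined decoder.

    Arguments are log M1, log M2, I(X1,X2;Y), I(X1;Y|X2) and I(X2;Y|X1),
    all in the same logarithmic unit.
    """
    # Approximate expected crossing times of the three walks of the true pair.
    times = {
        "joint": (log_m1 + log_m2) / i_joint,
        "user1_given_user2": log_m1 / i_1_given_2,
        "user2_given_user1": log_m2 / i_2_given_1,
    }
    limiting = max(times, key=times.get)
    n = times[limiting]
    # Rates are information symbols per channel observation at the receiver.
    return limiting, (log_m1 / n, log_m2 / n)


if __name__ == "__main__":
    # Hypothetical values: I(X1,X2;Y) = 1.0, I(X1;Y|X2) = I(X2;Y|X1) = 0.7,
    # hence I(X2;Y) = 1.0 - 0.7 = 0.3.
    print(combined_rate_pair(10.0, 10.0, 1.0, 0.7, 0.7))  # equal message sets
    print(combined_rate_pair(7.0, 3.0, 1.0, 0.7, 0.7))    # ratio 0.7 / 0.3
```

With equal message sets the joint walk is the slowest and both rates approach $I(X_1,X_2;Y)/2$; with $\log M_1 / \log M_2 = I(X_1;Y|X_2)/I(X_2;Y)$ the pair approaches $(I(X_1;Y|X_2), I(X_2;Y))$, matching the two examples in the text.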



C. A suboptimal decoding scheme 

We conclude this section by presenting a suboptimal scheme 
that uses only single-user decoders, which is nothing but the 
successive decoding scheme adapted to variable length codes. 
Consider a receiver that decodes each message separately, 
treating the signal of the other transmitter as noise. In view 
of the preceding decoding rules, to decode the message of the 
first transmitter, we consider the following rule 

$$S(n) = \max_{w_1} \sum_{i=1}^{n} \log \frac{p(y_i \mid x_{1i}(w_1))}{p(y_i)},$$

where $p(y_i \mid x_{1i}(w_1)) = \sum_{x_2'} p(x_2')p(y_i \mid x_{1i}(w_1), x_2')$. Denote 
the expression under the summation by 

$$Z_i(w_1) = \log \frac{p(y_i \mid x_{1i}(w_1))}{p(y_i)}$$

and the summation by $S(n,w_1) = \sum_{i=1}^{n} Z_i(w_1)$, then 
$\{S(n,w_1)\}_{n \geq 1}$ are $M_1$ random walks and the receiver will 
declare the message corresponding to the first random walk 
crossing the pre-determined threshold. Hence, we let $N_1$ be 
the stopping time defined by 

$$N_1 = \min\{n \geq 1 : S(n) \geq (1+\epsilon)\log M_1\}.$$

Assuming that the message pair (1,1) is transmitted, we have 

$$E[Z_i(1)] = I(X_1;Y),$$

and 

$$E[Z_i(w_1 \neq 1)] = \sum_{x_1,y} p(x_1)p(y) \log \frac{p(y \mid x_1)}{p(y)} = -D(p(y) \,\|\, p(y \mid x_1)) < 0.$$

As before we can upper bound the probability of error by the 
probability that a random walk corresponding to a different 
message crosses the threshold $T_1 = (1+\epsilon)\log M_1$, which 
gives 

$$P_e \leq M_1 e^{-\lambda^*(w_1 \neq 1)T_1},$$

where $\lambda^*(w_1 \neq 1) = 1$ is the unique positive root of the log 
moment generating function of $Z_i(w_1 \neq 1)$. This allows us 
to conclude that 

$$P_e \leq M_1^{-\epsilon}$$

and for $M_1$ large 

$$E[N_1] \leq \frac{(1+\epsilon)\log M_1}{I(X_1;Y)}.$$

The same analysis applies to the decoding of the message sent 
by the second transmitter. Thus, any rate pair $(R_1,R_2)$ such 
that $R_1 < I(X_1;Y)$ and $R_2 < I(X_2;Y)$ is achievable using 
this strategy. 

Now let us improve this decoding scheme by noting that 
as soon as one of the two messages is decoded, the receiver 
can remove (the effect of) the signal of the corresponding 
transmitter from the received signal. Assume, without loss 
of generality, that the message from the second transmitter 
is decoded earlier, then for decoding the message of the 
first transmitter, the receiver can use the rule $S_{w_2}$ previously 
analyzed. A receiver using this improved decoding rule is able 
to decode the message coming from the first transmitter at 
time $N_{1,w_2}$ and achieve any $R_1 < I(X_1;Y \mid X_2)$. However, 
this decoding time might "virtually" happen before $N_2$, the 
decoding time of the message coming from the second 
transmitter. Thus, the actual decoding time of the message sent 
by the first transmitter is given by $\max(N_{1,w_2}, N_2)$, which 
implies that in order to approach the rate pair $(R_1,R_2) = 
(I(X_1;Y \mid X_2), I(X_2;Y))$, the ratio $\frac{\log M_1}{\log M_2}$ must be sufficiently 
large. 
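How large the ratio must be can be made explicit with the approximate expected decoding times derived in this section. The following worked step is a heuristic sketch: it assumes that the decoding times concentrate around their means, so that the expectation of the maximum is close to the maximum of the expectations.

```latex
\begin{align*}
E[N_{1,w_2}] &\approx \frac{(1+\epsilon)\log M_1}{I(X_1;Y \mid X_2)}, &
E[N_2] &\approx \frac{(1+\epsilon)\log M_2}{I(X_2;Y)}, \\[4pt]
R_1 = \frac{\log M_1}{E[\max(N_{1,w_2}, N_2)]}
  &\approx \frac{1}{1+\epsilon}
     \min\left\{ I(X_1;Y \mid X_2),\ \frac{\log M_1}{\log M_2}\, I(X_2;Y) \right\}, &
R_2 &\approx \frac{I(X_2;Y)}{1+\epsilon}.
\end{align*}
% Hence R_1 approaches I(X_1;Y|X_2) only when
\begin{equation*}
\frac{\log M_1}{\log M_2} \geq \frac{I(X_1;Y \mid X_2)}{I(X_2;Y)}.
\end{equation*}
```

Below this ratio, the first message is ready to be decoded before the second one has been removed, and $R_1$ is limited by the decoding time of the second transmitter.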

VII. Concluding Remarks 

An explicit code approaching the transmission rates of the 
random coding schemes presented here remains to be found. 
Nevertheless, for the suboptimal scheme presented in Section 
VI-C and for certain multiple-access channels, it might be 
interesting to consider coding schemes based on fountain 
codes. Notice that for the Gaussian multiple-access channel a 
practical decoding scheme using rateless codes and successive 
decoding has been introduced in [10], in the particular case 
where the cardinality of the set of messages is the same for 
each transmitter and when the decoding times are equal and 
deterministic. 

Observe that our coding schemes can easily be adapted to 
work when more than two users are simultaneously 
transmitting and when the channel statistics are unknown to the 
transmitters, as long as they are known to the receiver. Furthermore, 
note that, in [14], [16] and [17], variable length codes are 
successfully used in combination with different extensions of 
the maximum mutual information (MMI) decoder, to 
universally communicate over a class of unknown channels. In the 
context of universal coding over a multiple-access channel, 
the perfect mutual information decoder used in the random 
coding schemes proposed here may be replaced with the MMI 
decoder, as done for the decoding strategies described in the 
above references. 

Footnote: In the case of an additive channel, removing the signal of a 
decoded transmitter is done by a subtraction, which adds no complexity. 

Finally, we remark that the setup of this paper can be 
extended to allow noiseless and instantaneous feedback from 
the receiver to the transmitters. This requires making each $x_{1i}$ 
and $x_{2i}$ dependent on the past received values $Y^{i-1}$. In this 
setting, we can prove the following outer bound on $\mathcal{C}_{r_1,r_2}$: if 
a rate pair $(R_1,R_2)$ is in $\mathcal{C}_{r_1,r_2}$, then 

$$R_1 \leq r_1 I(X_1;Y \mid X_2) + (1-r_1)C_1$$
$$R_2 \leq r_2 I(X_2;Y \mid X_1) + (1-r_2)C_2$$
$$sR_1 + R_2 \leq r_2 I(X_1,X_2;Y) + s(1-r_1)C_1 + (1-r_2)C_2,$$

for some joint distribution $p(x_1,x_2)p(y \mid x_1,x_2)$. This outer 
bound can easily be derived using the ideas developed in the 
proof of Theorem 3 (without the introduction of the 
time-sharing random variable $Q$). This provides an extension of the 
outer bound on the capacity region of a multiple-access channel with 
feedback described in [11], to the case where the receiver can 
decode the messages at different instants of time. 

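Appendix A 
Proof of Lemma 2 

Let $\xi_i = 1\{N < i \leq N_1\}$ and consider 

$$\begin{aligned}
I(W_1; Y_{N+1}^{N_1} \mid Y^N, W_2) 
&= I(W_1; Y_1\xi_1, \xi_1, \ldots, Y_n\xi_n, \xi_n, \ldots \mid Y^N, W_2) \\
&= I(W_1; \xi_1 \mid Y^N, W_2) + I(W_1; Y_1\xi_1 \mid \xi_1, Y^N, W_2) + \cdots \\
&\quad + I(W_1; \xi_i \mid (Y\xi)^{i-1}, \xi^{i-1}, Y^N, W_2) + I(W_1; Y_i\xi_i \mid (Y\xi)^{i-1}, \xi^{i}, Y^N, W_2) + \cdots \\
&= \sum_{i=1}^{\infty} I(W_1; \xi_i \mid (Y\xi)^{i-1}, \xi^{i-1}, Y^N, W_2) 
 + \sum_{i=1}^{\infty} I(W_1; Y_i\xi_i \mid (Y\xi)^{i-1}, \xi^{i}, Y^N, W_2),
\end{aligned}$$

where we use the chain rule for mutual information to obtain 
the second equality. 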

The first summation can be bounded as 

$$\sum_{i=1}^{\infty} I(W_1; \xi_i \mid (Y\xi)^{i-1}, \xi^{i-1}, Y^N, W_2) \leq \sum_{i=1}^{\infty} H(\xi_i \mid \xi^{i-1}) = H(\xi_1, \xi_2, \ldots) = H(N_1 - N) \leq \log(eE[N_1 - N]),$$

where the last inequality is proved in [4] and [5, §1.3], as 
mentioned in the proof of Lemma 1. 

Footnote: As before, we define $Y_i\xi_i$ as being equal to $Y_i$ if $N < i \leq N_1$ and equal 
to a dummy symbol otherwise, where the dummy symbol is distinct from any of the letters in 
the output alphabet. 

For the second summation, we can write 

$$\begin{aligned}
I(W_1; Y_i\xi_i \mid (Y\xi)^{i-1}, \xi^{i}, Y^N, W_2) 
&= H(Y_i\xi_i \mid (Y\xi)^{i-1}, \xi^{i}, Y^N, W_2) - H(Y_i\xi_i \mid X_{1i}\xi_i, X_{2i}\xi_i, (Y\xi)^{i-1}, \xi^{i}, Y^N, W_2, W_1) \\
&\overset{(a)}{\leq} H(Y_i\xi_i \mid X_{2i}\xi_i, \xi_i) - H(Y_i\xi_i \mid X_{1i}\xi_i, X_{2i}\xi_i, \xi_i) \\
&= \Pr(\xi_i = 1)\left[ H(Y_i \mid X_{2i}, \xi_i = 1) - H(Y_i \mid X_{1i}, X_{2i}, \xi_i = 1) \right] \\
&= \Pr(N < i \leq N_1)\, I(X_{1i}; Y_i \mid X_{2i}, \xi_i = 1) \\
&\leq \Pr(N < i \leq N_1)\, C_1,
\end{aligned}$$

where in (a) we use the fact that given $(X_{1i}, X_{2i})$, $Y_i$ is 
independent of the past received values and of $(W_1, W_2)$. The last 
inequality follows since $p(y_i \mid x_{1i}, x_{2i}, \xi_i = 1) = p(y_i \mid x_{1i}, x_{2i})$ 
and by the definition of $C_1$. Thus, we get 

$$I(W_1; Y_{N+1}^{N_1} \mid Y^N, W_2) \leq \log(eE[N_1 - N]) + \sum_{i=1}^{\infty} \Pr(N < i \leq N_1)\, C_1 = \log(eE[N_1 - N]) + E[N_1 - N]\, C_1.$$

The second inequality follows in a symmetric way. 
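The last equality above rests on a counting identity that is worth spelling out. The step below is a sketch that assumes $N \leq N_1$ (as is the case if $N$ denotes the earlier of the two decoding times; this reading of the notation is an assumption about definitions given earlier in the paper).

```latex
\begin{align*}
\sum_{i=1}^{\infty} \Pr(N < i \leq N_1)
  &= \sum_{i=1}^{\infty} E\big[\mathbf{1}\{N < i \leq N_1\}\big] \\
  &= E\left[\sum_{i=1}^{\infty} \mathbf{1}\{N < i \leq N_1\}\right]
   = E[N_1 - N],
\end{align*}
% The exchange of sum and expectation is allowed because all terms are
% nonnegative, and the inner sum counts the integers i with N < i <= N_1.
```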

Appendix B 

Here, for channels with a zero-error capacity equal to 
zero, we argue that the definition of achievability given by 
Definition 1 is equivalent to the following alternate definition 
of achievability. 

Definition 3: A rate pair $(R_1,R_2)$ is said to be 
achievable if there exists a sequence of $(M_1, M_2, N_1, N_2)$ variable 
length codes with $E[N_1]$ and $E[N_2]$ increasing such that 

$$\liminf_{E[N_1] \to \infty,\, E[N_2] \to \infty} P_e = 0.$$

To see this, take the best variable length code (the one that 
achieves the minimum $P_e$) with a finite $E[N_1]$ and/or $E[N_2]$ 
such that $\epsilon > P_e > \epsilon_1$, for some $\epsilon > \epsilon_1 > 0$. Note that $\epsilon_1$ 
could not be equal to zero, otherwise this would imply that 
the zero-error capacity of the channel is different from zero. 
Hence, we can find an $\epsilon_2 > 0$ such that $\epsilon_1 > \epsilon_2$. Therefore, 
in order to achieve $P_e < \epsilon_2$, we need to increase $E[N_1]$ or 
$E[N_2]$. Repeating this argument, we see that $E[N_1]$ and $E[N_2]$ 
need to be arbitrarily large in order to achieve an arbitrarily low 
probability of error. 

Appendix C 

In this appendix, we show that for a random walk with 
a positive drift, the time spent to hit a positive 
threshold concentrates around its mean. Consider a random walk 
$S(n) = \sum_{i=1}^{n} Z_i$ where $\{Z_i\}$ are i.i.d. random variables with 
$E[Z_i] > 0$, and let $N$ be the first time at which $S(n)$ crosses 
a given threshold $T^* > 0$. By Wald's equality we know that 
for large $T^*$, $E[N] \approx T^*/E[Z_i]$ and here we want to show that 
with high probability $E[N](1-\epsilon^*) \leq N \leq E[N](1+\epsilon^*)$, for 
some $\epsilon^* > 0$. But, the following clearly holds 

$$\Pr(N > E[N](1+\epsilon^*)) \leq \Pr(S(E[N](1+\epsilon^*)) < T^*),$$

where the RHS corresponds to the probability that the random 
walk is under the threshold at time $E[N](1+\epsilon^*)$, which is a 
large deviation event, since we have 

$$\Pr(S(E[N](1+\epsilon^*)) < T^*) = \Pr\left( \frac{1}{E[N](1+\epsilon^*)} S(E[N](1+\epsilon^*)) < \frac{E[Z_i]}{1+\epsilon^*} \right) \leq e^{-E[N](1+\epsilon^*)c(\epsilon^*)},$$

where $c(\epsilon^*)$ is some constant depending on $\epsilon^*$. The same 
conclusion can be obtained for the lower bound, thus as $T^*$ 
gets large, $N$ concentrates around its mean. 
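The concentration claim can be illustrated with a short simulation. This is a minimal sketch under assumptions that are not in the paper: the steps $Z_i$ are Gaussian with hypothetical mean 0.5 and standard deviation 1, $\epsilon^* = 0.2$, and the empirical mean of the crossing times stands in for $E[N]$. The names `crossing_time` and `concentration` are illustrative.

```python
import random


def crossing_time(threshold, drift, sigma, rng):
    """First n at which a Gaussian random walk with positive drift reaches threshold."""
    s, n = 0.0, 0
    while s < threshold:
        s += rng.gauss(drift, sigma)
        n += 1
    return n


def concentration(threshold, eps_star=0.2, drift=0.5, sigma=1.0, trials=300, seed=0):
    """Return (empirical mean of N, fraction of runs with N within (1 +/- eps_star) of it)."""
    rng = random.Random(seed)
    times = [crossing_time(threshold, drift, sigma, rng) for _ in range(trials)]
    mean_n = sum(times) / trials
    inside = sum((1 - eps_star) * mean_n <= n <= (1 + eps_star) * mean_n
                 for n in times)
    return mean_n, inside / trials


if __name__ == "__main__":
    for t_star in (25.0, 100.0, 400.0):
        mean_n, fraction = concentration(t_star)
        print(t_star, mean_n, t_star / 0.5, fraction)
```

As the threshold grows, the empirical mean should approach the Wald estimate $T^*/E[Z_i]$ and the fraction of runs inside the window should approach one.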